AITopics | single-agent policy tree search

Single-Agent Policy Tree Search With Guarantees

Neural Information Processing SystemsNov-20-2025, 22:12:47 GMT

We introduce two novel tree search algorithms that use a policy to guide search. The first algorithm is a best-first enumeration that uses a cost function that allows us to provide an upper bound on the number of nodes to be expanded before reaching a goal state. We show that this best-first algorithm is particularly well suited for ``needle-in-a-haystack'' problems. The second algorithm, which is based on sampling, provides an upper bound on the expected number of nodes to be expanded before reaching a set of goal states. We show that this algorithm is better suited for problems where many paths lead to a goal. We validate these tree search algorithms on 1,000 computer-generated levels of Sokoban, where the policy used to guide search comes from a neural network trained using A3C. Our results show that the policy tree search algorithms we introduce are competitive with a state-of-the-art domain-independent planner that uses heuristic search.

algorithm, name change, single-agent policy tree search, (4 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.61)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback

Reviews: Single-Agent Policy Tree Search With Guarantees

Neural Information Processing SystemsOct-7-2024, 10:26:43 GMT

The paper presents two planning algorithms based on tree search. The novelty of these algorithms is the use of a policy based criterion instead of a standard heuristic to guide the search. The first algorithm (LevinTS) uses a cost function d(n)/\pi(n) which ensures nodes are expanded in a best-first order. This allows the authors to derive an upper bound to the number of expansions performed before to reach a target node. The second algorithm (LubyTS) is a randomized algorithm based on the sampling of trajectories.

algorithm, expansion, single-agent policy tree search, (10 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.73)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.52)

Add feedback

Single-Agent Policy Tree Search With Guarantees

Orseau, Laurent, Lelis, Levi, Lattimore, Tor, Weber, Theophane

Neural Information Processing SystemsFeb-14-2020, 12:11:55 GMT

We introduce two novel tree search algorithms that use a policy to guide search. The first algorithm is a best-first enumeration that uses a cost function that allows us to provide an upper bound on the number of nodes to be expanded before reaching a goal state. We show that this best-first algorithm is particularly well suited for needle-in-a-haystack'' problems. The second algorithm, which is based on sampling, provides an upper bound on the expected number of nodes to be expanded before reaching a set of goal states. We show that this algorithm is better suited for problems where many paths lead to a goal.

algorithm, single-agent policy tree search, tree search algorithm, (2 more...)

Neural Information Processing Systems

Genre: Research Report (0.43)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback

Single-Agent Policy Tree Search With Guarantees

Orseau, Laurent, Lelis, Levi, Lattimore, Tor, Weber, Theophane

Neural Information Processing SystemsDec-31-2018

We introduce two novel tree search algorithms that use a policy to guide search. The first algorithm is a best-first enumeration that uses a cost function that allows us to provide an upper bound on the number of nodes to be expanded before reaching a goal state. We show that this best-first algorithm is particularly well suited for ``needle-in-a-haystack'' problems. The second algorithm, which is based on sampling, provides an upper bound on the expected number of nodes to be expanded before reaching a set of goal states. We show that this algorithm is better suited for problems where many paths lead to a goal. We validate these tree search algorithms on 1,000 computer-generated levels of Sokoban, where the policy used to guide search comes from a neural network trained using A3C. Our results show that the policy tree search algorithms we introduce are competitive with a state-of-the-art domain-independent planner that uses heuristic search.

artificial intelligence, machine learning, node, (18 more...)

Neural Information Processing Systems

Country: